Spatial evolutionary prisoner's dilemma game with three strategies 

and external constraints 
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The emergency of mutual cooperation is studied in a spatially extended evolutionary prisoner's dilemma 
game in which the players are located on the sites of cubic lattices for dimensions d = 1, 2 and 3. Each player 
can choose one of the three following strategies: cooperation (C), defection (D) or Tit for Tat (T). During 
the evolutionary process the randomly chosen players adopt one of their neighboring strategies if the chosen 
neighbor has higher payoff. Moreover, an external constraint imposes that the players always cooperate 
with probability p. The stationary state phase diagram is computed by both using generalized mean-field 
approximations and Monte Carlo simulations. Nonequilibrium second order phase transitions associated with 
the extinction of one of the possible strategies are found and the corresponding critical exponents belong 
to the directed percolation universality class. It is shown that forcing externally the collaboration does not 
always produce the desired result. 

PACS numbers: 02.50.-r, 05.50.+q, 87.23.Cc 



I. INTRODUCTION 

Evolutionary game theory has attracted a lot atten- 
tion during the past years [^]j2j in human sciences, polit- 
ical sciences, biology and economics. In particular, the 
so-called evolutionary prisoner's dilemma game (PDG), 
which is a metaphor for the evolution of cooperation in 
populations of selfish individuals, has been minutely in- 
vestigated In the original form of the PDG, only 
uniform populations with given strategies were consid- 
ered. However, it was realized that new interesting 
phenomena can occur when the PDG was expanded in 
such a way that local contests in a d-dimensional space 
could take place (we shall use the abbreviation SPDG for 
such systems) . It turns out that these spatially extended 
models are similar to the ones studied in nonequilibrium 
statistical physics. They may exhibit cooperative behav- 
ior resulting in phase transitions in the stationary state. 
Accordingly, it is very fruitful to study SPDG like models 
using the tools developed in the framework of nonequi- 
librium statistical physics. 

In its simpler form, the PDG is a version of matrix 
games where the symmetric incomes of the two players 
depend on their simultaneous decisions, whether they 
wish to cooperate with the others or to defect. Each 
player wants to maximize his individual income. The 
highest individual payoff (the temptation to defect) can 
be reached by the defector against the cooperator re- 
ceiving the lowest reward (sucker's payoff). The mutual 
cooperation results in the highest total payoff divided 
equally between the players. For mutual defections the 
players get a lower payoff exceeding the value of sucker's 
payoff. Two rational players will both defect because 
this choice provides the larger income independently of 



the partner's choice. 

On the contrary, mutual cooperation dominates in eco- 
nomic and biological systems where the contestants in- 
teract frequently. In the iterated round-robin PDG, the 
players, knowing the previous decisions, have to choose 
between two options (defection and cooperation). For a 
given round the contestants can be classified according 
to the total individual payoffs they have obtained. Fol- 
lowing the Darwinian selection principle, at each round 
the worst player will adopt the winner's strategy. 

Extended numerical simulations have been performed 
to select the "best strategy" among many [Hj^]. Best 
does not means that this strategy will always win a fight 
against another strategy, but means that it will obtain 
the highest payoff during a tournament during which it 
will have to fight against many opponents having differ- 
ent strategies. In ||] the highest payoff was obtained by 
the so-called "Tit for Tat" (T) strategy which cooper- 
ates in the first round and then always repeats his co- 
player's previous decision. The main characteristics of 
this strategy (never defect first, react to the defection of 
the opponent and forgiveness) are crucial ingredients to 
sustain the mutual cooperation against the defectors. In 
particular, extensive simulations (see for a summary) 
for the case in which the players adopt one of the two 
following strategies: C (cooperate unconditionally) or D 
(defect unconditionally) have shown that the cooperators 
were disappearing in the stationary state. The introduc- 
tion of some T's strategies has an important effect. For 
short times, the D's population increases while the C's 
one decreases, leading to the decrease of D's payoffs. As 
a consequence, the T's can invade the D's population. 

In order to study the spatial effects Nowak and May 
have introduced an SPDG consisting of a two-state cel- 
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lular automaton. The players are located on a regular 
lattice in a d-dimensional space and can adopt the C's or 
D's strategy. Each player is fighting with the individuals 
belonging to a given neighborhood. The player's strate- 
gies are upgraded simultaneously in discrete time steps 
according to the following rule: each player adopts the 
best strategy found in his neighborhood. This model ex- 
hibits a rich variety of spatial and temporal patterns as a 
function of the payoff b characterizing the temptation to 
defect. Other SPDG models have also been investigated 
[p[-p^. In particular Killingback and Doebeli have 
shown that "Pavlov" like strategies can be even more 
efficient than "Tit for Tat" in some circumstances. 

Nowak et al. have also extended the above analysis 
by allowing stochasticity (irrational choices) during the 
evolutionary process. The degree of stochasticity is gov- 
erned by a parameter called m, and in the limit m — s- cxd, 
one recovers the deterministic case. According to the 
value of b, several stationary states are possible. 

In some related models Szabo and Toke have 

shown that the different stationary state phases were sep- 
arated by second order nonequilibrium phase transitions 
lines. The associated critical exponents belongs to the 
directed percolation (DP) universality class p^ , p^ . 

Both the importance of the presence of the T's in 
the PDG and the richness associated with the spatially 
extended aspect motivated us to study a new class of 
SPDG. In the present work, we study a novel aspect of 
the SPDG, namely what is happening when the cooper- 
ators are enforced by some external constraints. More 
explicitly, we consider a SPDG with the three strategies 
D, C, and T and investigate the effects of random adop- 
tion (or forcing) of C strategies. This new effect can be 
interpreted as an attempt by a government or by any 
other organization to enforce the cooperation among in- 
dividuals by forcing some of them, chosen randomly with 
a given probability p, to cooperate. A different interpre- 
tation can also be given. Among the player's community, 
some old players resign and are replaced by younger ones 
having a different educational background making them 
more open to collaborate. As we shall see the effects of 
this external constraint are rather surprising. Indeed, in 
dimensions 1,2 or 3, the presence of the constraint re- 
duces the cooperation if p is less than a given threshold 
value Pc depending upon the dimensionality of the sys- 
tem. The cooperation is enforced only if the constraint 
is strong enough, i.e. if p > pc. 

The nonequilibrium second order phase transitions de- 
scribing the extinction of one of the possible strategies are 
found to belong to the directed percolation universality 
class. 

The paper is organized as follows. The model is defined 
in section II. Its properties are analyzed in mean-field 
approximation in section III. The properties of the model 
without constraint are discussed in section IV. The model 
with constraint is studied in section V, both in mean-field 
approximation and by Monte Carlo (MC) simulations, for 
respectively 1, 2 and 3 dimensions. Finally, conclusions 



are drawn in section VI. 



II. THE MODEL 

We consider a SPDG model in which the players are 
located on the sites of a d-dimensional cubic lattice of 
linear size L where periodic boundary conditions are as- 
sumed. Each player can adopt one of the three following 
strategies D (defects), C (cooperates) or T (Tit for Tat) 
and interacts with its 2d nearest neighbors. The total 
payoff of a given player is the sum of the payoffs coming 
from the interaction with all its nearest neighbors. We 
use an extension (including the Tit for Tat strategy) of 
the payoff matrix used by Nowak and May ^ . The in- 
dividual payoffs for the players Pi and P2 as a function 
of their strategies are given in Table 1. The only free pa- 
rameter 6 (1 < 6 < 2) measures the temptation to defect. 
Note that the above payoff matrix does not take into ac- 
count that the T players always try to cooperate with 
D's during the first round. Thus these values are consid- 
ered as the averaged (or in this present case stabilized) 
payoffs. 
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Table 1 : Payoff matrix of the model 

It is legitimate to use this payoff matrix providing that 
the strategy adoption is rare comparing to the frequency 
of the game. It makes the simulation simpler and more 
efficient. Notice that similar payoff matrices can be ob- 
tained when substituting some other "nice" strategy (as 
Pavlov which cooperates in the first round for example, 
a strategy which is nevertheless less efficient than T in 
this case) |,| for T. 

For the non constrained case (p — 0), the system 
evolves in discrete time according to the following MC 
process. Starting from a random initial state, a site is 
chosen randomly. This site updates its strategy by first 
selecting randomly one of its nearest neighbor and sec- 
ond, by adopting the strategy of this nearest neighbor 
only if it is having a higher payoff. A MC steps consists 
in updating each lattice site once, on the average. 

In the constrained case, at each time step, each player 
is forced to adopt the cooperative strategy C with a prob- 
ability p. 

Thus the dynamics of the constrained model is the fol- 
lowing. 

• One chooses randomly one player. 

• With probability p this player adopts the C strat- 
egy- 
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• With probability {1 — p) the player searches for 
a better strategy according to the procedure de- 
scribed above. 

• The players update their payoffs. 

The model is characterized by three free parameters: 
b,p, d. 

It is hopeless to find exact analytical solutions for such 
models. Accordingly, we shall first study them in the 
framework of mean-field like approximations and then 
investigate them by numerical simulations. 

III. MEAN-FIELD LIKE APPROXIMATIONS 

The simplest mean-field approximation consists in ne- 
glecting all the spatial correlations in the systems. This 
amount to consider a model in which for each player, 
the partners are chosen randomly in the system instead 
of being restricted to a particular neighborhood. Each 
player interacts with the same number of counterparts. 
The dimensionality of the system plays no role. The sim- 
plest mean-field like approximations have been success- 
fully used previously for similar problems and more de- 
tails concerning this approximative scheme can be found 
in textbooks (||,|. 

Within this approximation, the dynamics of the system 
is completely described by the time dependent concentra- 
tions: 

CaW = (A^a(i))/i^ {a= C,D,T) (1) 

where Na (t) is the number of players with strategy a at 
time t. These concentrations satisfy the normalization 
condition cd + cc + ct = 1- 

According to Table 1 the average payoffs for each strat- 
egy are: 

mo = bcc , mc = cc + ct , tut ^ cc + ct ■ (2) 

Notice that the C and T strategies have the same payoffs, 
therefore no strategy exchange will occur among them. 

Following the evolutionary rules given in Sec. H, the 
concentrations Cq, (t) obey the following equations of mo- 
tion: 

CD = -pcD T (1 - p)cd{cc + Ct) , 

Cc = +p{cD + Ct) ± (1 - p)cdcc , (3) 

CT = -pcT ± (1 - p)cdCt , 

where the upper and lower signs refer respectively to the 
cases when strategy D is dominated by C and T ( itid < 
mc — rtiT), and when D dominates C and T (m^ > 
mc = mT). 

The numerical integration of the above equations of 
motions leads, for several values of p, to the flow diagrams 
shown in Fig. |l|. 



The quantities represented on the vertical and hori- 
zontal axes are respectively cd and cc — ct- The upper 
corner of the triangle corresponds to the state of c^i = I, 
while the lower left and right corners describe respec- 
tively homogeneous states with ct = 1 and cc = 1- 




-1 1 
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FIG. 1. Trajectories in the two-dimensional phase space 
for three different values of p as indicated in the figures. The 
dashed lines divide the phase spaces into two regions: on their 
left (right) hand side the payoff of the D strategy is lower 
(higher) than those of the C and T ones. 

For < p < 1/2 the stationary state solution of the 
above equations is: 

l-2p p 

CD ^ ; Cc ^ ; CT = (4) 

1 — p 1 — P 

while for p > 1/2 the system goes to the adsorbing state 
(cc — I and cd = ct = 0). Surprising the T strategy 
extincts if p > 0. 

Without constraints, i.e. for p — 0, the system tends 
either to a homogeneous D adsorbing state {cd — f ) or 
to a mixed state of C and T strategies {cd — 0) de- 
pending on the initial conditions. Note that the above 
stationary states are independent of the value of b. Strik- 
ingly different behaviors will be observed beyond this 
simplest mean-field approximation when the local fluc- 
tuations and short-range correlations are taken into ac- 
count as we shall see in the following sections. 

More elaborated mean-field like approximations can be 
devised. The basic idea is to take explicitly into account 
some of the spatial correlations by computing the prob- 
ability of appearance of all the possible configurations 
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of a small cluster containing n sites. In one dimension, 
one considers a cluster of n consecutive lattice sites. The 
equations of motion for these probabilities follows from 
the evolution rules of the system. Details concerning 
the one-dimensional case are given in Note that 

already the one site mean- field approximation (n = 1) 
differs from the simple mean-field solution given above. 
In higher dimensions pair or square mean-field like ap- 
proximations have been used, and a detailed description 
of these methods can be found in ||l^,|l^. Note that 
now, the predictions of these approximations depends 
upon the dimension of the system. Accordingly, the cases 
d = 1, 2 and 3 will be discussed in different subsections. 

IV. MODEL WITHOUT EXTERNAL 
CONSTRAINT 

Let us start by considering the case with no external 
constraint, i.e. p = Q. In this case, the dynamic is simple: 
a given player adopts the strategy of one of his randomly 
chosen neighbor providing that this neighbor has a higher 
payoff. As we shall see, the only stationary states are 
either always trivial cooperator like (C or C-T) or a pure 
defector one {D). 

When the random initial state is made of only coop- 
er ators and defectors, one founds that, both in = 1,2 
and 3 dimensions, the stationary state of the system is a 
pure defector one. 

The reason in d = 1 is simply that a D player has 
always a higher payoff than a neighboring C player. 

In d = 2 several configurations should be analyzed. 
The most favorable situation for C to win is when it is 
adjacent to a flat C-D interface. However, if this interface 
has an irregularity then the D's can invade the sea of C 
players. Indeed, a new born C in the D half-plane is 
always weaker than the D's at the interface and thus, 
such C"s disappears sooner or later. However, at a given 
time and with some finite probability, two next nearest 
neighbor C players could be present in the D half-plane. 
It results a payoff 36 for the D player squeezed between 
three C and this D can now invade the C's and sweep off 
one layer of them independently of the value of 6. 

The c? = 3 case is very similar, but the process is slower. 
Indeed, four nearest neighbors of a D player (called Di) 
are necessary to be invaded by the C's if the value of b is 
close to 1. Once it has occured, Di is strong enough to 
cross the interface and then destroys all the C's. 

For a large system with initially a finite proportion 
of T players, the stationary state is always a cooperator 
like state (T-C) . This asymptotic behavior can be easily 
understood in the one dimensional case. Let us suppose 
that there are four neighboring T (TTTT) in the initial 
state of a Id system, (this is practically always the case 
in a sufficiently large system with a finite probability to 
have some T's in the initial state), then they kill all of the 
D's. Even in the worst case two D's could invade into to 



the T population, (CDDTTDDC), but then the central 
T's become stronger (payoff=l) then the D's (payoff=0). 
The D's kill all of the neighboring C's, and even these 
two T's can invade the whole D area. 

This argument can be extended to d dimensions. A 
cube of linear size 4, made of T players is enough to 
guarantee the cooperation in the stationary state. The 
reason is that the Z^'s surrounding this cube cannot de- 
stroy the central T players inside a cube of linear size 2. 
Indeed, the central T's have always d T's neighbors to 
cooperate with (payoff=d), but the neighboring D's can 
only have one C neighbor (payoff< b < d ii d > 2). 

V. MODEL WITH EXTERNAL CONSTRAINT 

We now consider the case in which the players are lo- 
cated on the sites of a d-dimensional cubic lattice in the 
presence of an external constraint imposing to each player 
to choose the C strategy with probability p. The cases 
d ~ 1,2 and 3 will be considered. 

A. One-dimensional system 

In the one-dimensional model the players located on 
the sites of a chain interact with their two nearest neigh- 
bors. It is easy to see that the dynamics is independent 
of the value of the parameter b in its domain of definition. 

A systematic MC analysis of the stationary states was 
performed by varying the value of p for different system 
sizes between L = 32 to L = 16384. Our simulations 
show that the T's strategy extincts for all values of p. 
However, as a function of p, the stationary state can be 
either a symbiosis of C and D strategies or a pure C state 
as shown in Fig. |^. For very small values of p, one has a 
pure C stationary state (cc = 1) but, when p reaches the 
value pi , a first transition occurs to a stationary state in 
which the two strategies D and C coexist. At p = Pc2 > 
Pi the system undergoes a second continuous transition 
from the C-D stationary state to the pure absorbing C 
state. 

The transition at pc2 is easy to understand. In one 
dimension and when only C and D strategies are present 
our model is equivalent to the contact process (CP) 
which was originally introduced as a simple model for 
epidemics ||l^. In the CP a particle (sick person) can 
disappear at rate 1, and an empty site (healthy person) 
can become occupied with rate Az/2, where A is the con- 
trol parameter and z is the number of particles in the 
neighborhood of the empty site. In our model the D 
strategy, which in d = 1 is always better than the C 
one, plays the role of the sick persons and the C strat- 
egy, which can only be created in the system through the 
constraint, corresponds to the healthy individuals. 
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FIG. 2. The concentration of strategy D as a function of 
p in the one- dimensional system. The open squares are the 
MC data obtained for L = 16384. The solid lines represent 
the results of n-point approximations (n = 1, 7 from right 
to left). 

Starting from a C-D-T initial state and after the ex- 
tinction of T's the possible stationary states are the pure 
C or a C-D states. The C-D state corresponds to the 
steady state of the CP as well. For p around Pc2, the T's 
disappear rapidly and thus do not affect the extinction 
of the Cs. Hence the second transition to the absorbing 
state corresponds to the CP's one. This transition oc- 
curs at Pc2 = 1/(1 + Ac) = 0.23267 [|l8|] and is believed to 
belong to the DP universality class | p_9| , p0t . 




0.02 0.04 0.06 0.08 0.1 
P 

FIG. 3. The probability, Pl{p), of reaching the C-D state 
for different system sizes (from right to left L = 50, ICQ, 
200, 500, 1000, 2000, 5000, 10000 and the interpolation for 
L — oo). The dotted lines represent a fit described in the 
text. 

The behavior of the first transition at small p's is much 
less clear. It turns out that for our model two characteris- 
tic parameters Pq (L) aiidpp(L), depending on the system 
size L, can be introduced. For p < Pa{L) the stationary 
state is always the pure C state while for p > ppiL) 
the system evolves to a C-D coexistence phase, which is 
a steady state of the CP. For pi{L) < p < P2{L), the 
system can evolves towards one of the two possible sta- 



tionary states, depending on the particular realization of 
the random numbers and on the initial state. 




0.1 0.2 0.3 0.4 



{p-p»[Pl(p)])/l''^ 

FIG. 4. The probability p as a function of the scaled pa- 
rameter. The symbols represents the different system sizes as 
in Fig. |. 

The probability, pl{p), of reaching the C-D state has 
been investigated numerically. The data are plotted in 
Fig. I and can be well fitted by a function of the type 
1 — exp(— ci(p — poY^), where ci, C2 and po are fitting 
parameters. The limiting function Poo{p) was obtained 
by using usual finite size scaling methods. As shown in 
Fig. ^, the functions pl{p) collapse on a single curve if 
one plots p as a function of {p - p^[pl{p)]] / L^'^ ■ This 
shows that Poo{p) do not collapse to zero, hence the phase 
transition takes place at a finite value of p ~ 0.025. 

15000 r ■ ■ ■ ■ 1 



10000 

Q 

5000 - ' 

0.02 0.03 0.04 0.05 0.06 0.07 

P 

FIG. 5. The respective times tc {L = 2000(*)) and 
tcD {L = 200 (-I-), 500 (x), 2000 (□)) needed to reach 
the C and the C-D stationary states. For comparison, 
10000 X pi,=20oo(p) is also presented with dotted line. 

The respective times tc and tcD needed to reach the 
pure C and the C-D stationary states have also been 
investigated. Both times show a singular behavior (see 
Fig. 1^). Unfortunately, we were not able to find a rea- 
sonable scaling fit for those data. However, it is obvious 
from the simulations, that ten exhibits a much stronger 
singularity than tc- Hence, for a large system, the time 
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needed to reach the C-D state is much larger than the one 
needed to reach the C state, even if the system evolves 
to the C-D state more frequently. 

Beside the MC simulations, the properties of the sys- 
tem were also investigated using generalized mean-field 
approximations. Assuming the coexistence of all the 
three strategies these calculations can be performed nu- 
merically on clusters of sizes as large as n = 4. Contrary 
to the MC results (see Fig. ||) these approximations pre- 
dicted the existence of a C-D-T state at small p values. 
As far as the T's are concerned, we observed that the 
maximum value of ct is decreasing when the cluster sizes 
n were increasing, providing us a trend for the extinction 
of the T's. 

The contradiction between the present mean-field and 
MC results refers to the importance of (interfacial) inva- 
sion phenomena detailed later on. 

Allowing only two strategies (C and D) the above 
mean-field analysis can be performed for larger cluster 
sizes up to n = 7. The results are compared with the MC 
ones in Fig. |[ As expected, the accuracy of the general- 
ized mean-field method increases with the cluster size n. 
The extrapolation of the results obtained for finite values 
of n (n = 1, 7) leads to a critical value Pc^^^"* — 0.235. 
The quality of this approximative scheme can be esti- 
mated by comparing the value of p),2 with the best 
known numerical value Pc2 = 0.23267 p^ ]. 

In order to understand the behavior of the model 
around the first transition it is interesting to examine 
the time evolution of the system in its transient regime. 
As illustrated in Fig. ^, one can observe a domain growth 
process controlled by cyclic invasions This picture 

suggests that the most relevant aspect of the dynamics 
is the collision between the fronts separating different 
strategies. 




FIG. 6. Evolution of strategy distribution for p — 0.04 
starting from a random initial state. The fc-th row shows 
the positions of the D (closed squares), C (empty), and T 
(open circle) strategies at the time 2k measured in MC steps. 



Let us first investigate the motion of a front separat- 
ing a cluster of C from a D one. The D domains contain 
some C sites coming from the external constraint. How- 
ever, the life times of the C's are very short [~ (1 — p)]- 
Within a D domain the average C concentration is equal 
to that of the CP and can be well approximated by the 
simple mean-field approach as we have seen above. The 
D strategy invades the territory of the C one (absorbing 
state). Due to the reflection symmetry the invasion front 
can move to the left or to the right with the same abso- 
lute value of the velocity. To first order in p in the limit 
p —^ 0, the absolute value of the average invasion velocity 
can be estimated by using the configuration probabilities 
given for n = I. This calculation yields 



1 - 3p 



(5) 



Let us now consider the invasion of the T strategy into 
the D one. Inside a territory which has been invaded by 
the T strategy, the C strategy is setting up with a prob- 
ability p. As both C's and T's have the same payoff, no 
adaptation of strategies occur between them. Assuming 
that this invading front travels with a constant velocity 
M, the probability of having a C player at a site k steps 
behind the front is 1 - e-P^('=). T(fc) = {k + 1/2) /u is 
the averaged time it takes to the front to move over a 
distance of k. Thus, it follows that, to leading order in 
P, 



1 - Up 



(6) 



We note that above approximations lead to a velocity 
for the D ^ C front which is larger than the one for 
the T ^ D. This prediction can be compared with the 
results of the MC simulations (see Fig. For the D 
C front, the agreement is very good if p < 0.1, while for 
the T ^ D case, the approximation reproduces well only 
the linear part near the origin. 




0.00 0.02 0.04 0.06 0.08 0.10 
P 

FIG. 7. Velocities of D — > C (open squares) and T ^ D 
(diamonds) invasions as a function of p. The solid and dashed 
lines indicate the analytical predictions in linear approxima- 
tion. 
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Another consequence of the external constraint con- 
cerns the hfe-time of the T clusters. According to the 
above picture, the probability that a T cluster dyes out 
during the time evolution can be approximated by: 



Pt = f[[l exp(-p(fc + 1/2) /u) 



(7) 



k=Q 



The inverse of Pt can be interpreted as the average life 
time tt of an invading T cluster. Substituting an integral 
for the infinite sum appearing in the logarithm of the 
above expression leads to: 
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„ „ 2 1 ^ lip 

Tt — exp( j ~ exp(7r 



6p 



12p 



(8) 



This expression shows a fast increase of the life-time when 
p is decreasing. For example, a T cluster dyes out in 
about 10^ MC steps (MCS) if p = 0.04. This estimated 
life-time is significantly larger than those found by MC 
simulations [t^^'^^ (p = 0.04) « 800 MCS] as indicated 
in Fig. ^. The p-dependence of the MC data can be well 
approximated by the function tt = 3.26 exp(0.226/p) 
within the region 0.025 < p < 0.08 we could study. The 
large discrepancy between the mean-field and MC results 
refers to the enhanced role of the velocity fluctuations. 

Starting from a random initial state spatially sepa- 
rated, domains are rapidly formed. Then two different 
situations can occur. First, T clusters are present one 
both ends of each D clusters leading to a fast extinc- 
tion of _D's. Accordingly, the system evolves to a pure 
C state. Second, after some time the system reaches a 
state characterized by the presence of only one D cluster, 
having a T island at only one of its ends, in a sea of C. 
As the D ^ C invasion is faster than the T ^ D, the T's 
can never destroy the D's and due to the finite life-time 
of the T cluster the stationary state is a C-D coexisting 
one. The exponential increase of the life-time of this T 
cluster as p — > explains the singular behavior observed 
in tcD- 

For p < pi{L) the life-time of the T players are so long 
that all the D clusters are surrounded by T's. Accord- 
ingly, the first scenario described above is always present, 
leading to a pure C state. In the contrary, for p > P2(,L) 
the short lifetime of T's insures that the T's disappear 
rapidly, allowing for the growth of the D clusters. As a 
consequence the stationary state is a C-D one. 



B. Two-dimensional system 

The players are located on the sites of a two- 
dimensional square lattice. According to the payoff ma- 
trix (see Table 1), two ranges of values of b have now to 
be distinguished, namely, 1 < 6 < 3/2 and 3/2 < & < 2. 
For any value of h in one of those ranges the dynamics is 
the same. 
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FIG. 8. Stationary state densities of D (closed squares), C 
(open circles) and T (open squares) strategies as a function of 
p as obtained by MC simulation of a system of size 512 x 512 
for 3/2 < 6 < 2. The solid (D), dotted (C) and long-dashed 
(T) lines indicate the predictions of the square mean-field like 
approximation. 

Let us first consider the simulations performed for 
the value 3/2 < 5 < 2. The situation is summarized 
in Fig. ^, in which the stationary values of the strat- 
egy concentrations are plotted as a function of p. For 
p < pf.^^*^"* = 0.1329(1), the system reaches a station- 
ary state in which the three strategies C, D and T coex- 
ist. Here it is worth mentioning, that for small p values 
{p < 0.03) the finite system can reach the absorbing state 
if the initial state has been chosen randomly. Below this 
size-dependent threshold value the three-strategy state 
can be formed and sustained by slow decreasing of p dur- 
ing the simulation. In this case the extinction of T's and 
_D's is a consequence of fluctuations (detailed later on) 
and the coexistence of the three strategies is considered 
as the real stationary state in the limit T — > oo. The sim- 
ulations performed for p > 0.005 show that cjj decreases 
linearly with p when p 0. 

Whenp^f <p< p^f^^ = 0.3671(1) only the strate- 
gies C and D survive. Finally, when p > p\^ the system 
reaches a pure C absorbing state. 

The phase diagram obtained by numerical simulations 
can be compared with the ones obtained using the ex- 
tended mean-field approximation described in Sec. p|. 

n_ 



(pair J 



At the level of pair approximation, one finds p^^ 
0.1704 and p^^^^"^^ — 0.4236, while for the square mean- 
field approximation one finds Pci^"* = 0.1482 and ^^2*'' = 
0.3980. These latest results are plotted in Fig. |. They 
compare well with the MC values given above. 

Some complementary information can be obtained by 
studying the concentration fluctuations defined as: 



Xa =T''((A^a/T''-Ca)2), {0L= C,D,T) 



(9) 



When p ^ 0, the concentration fluctuations xc and yj 
are diverging while xd remains regular as shown in Fig. 0. 
However, the sizes of the systems investigated were not 
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large enough to conclude that xc E^nd xt are diverging 
as power laws of p in the limit p —f Q. 




FIG. 9. Concentration fluctuations for the D (closed 
squares) and T (open squares) strategies for a 512 x 512 sys- 
tem. 

Moreover, the simulations suggest that for the station- 
ary state in which the three strategies coexist, the typical 
size of T domains (as well as their persistence time) is 
proportional to Similar behavior was found in the 
forest fire models introduced by Bak et al. |^ to 
study the self-organized criticality. 

The points pd and Pc2 are critical points where a sec- 
ond order nonequilibrium phase transition takes place. 
Indeed, the T concentration vanishes at pd as ct ~ 
{Pci — p)^^ , while the £)'s concentration vanishes at Pc2 
as CD ~ {Pc2 — pY'^ ■ In order to justify this behavior 
a very careful numerical analysis was performed, using 
longer sampling times in the vicinity of critical points. 
Fitting the numerical data leads to the above mentioned 
values of the critical points and Pi = (32 = 0.57(3) which 
is compatible with the directed percolation exponent as 
expected on general ground pO| . This fact is confirmed 
by the study of concentration fluctuations defined by 
Eq. (|). Sharp increases of the concentration fluctuations 
are expected at second order phase transitions. Fig. ^ il- 
lustrates this point. The concentration fluctuations of C, 
D and T strategies behave, near the transition point pd , 
as X ~ {pd —p)~^^ with 7i = 0.37(6). A similar behavior 
was found at the transition point pc2 with an exponent 
72 = 0.37(9). These values are very close to the one of 
directed percolation: k. 0.35 in two dimensions [ pO| . 

The above data suggest that p = is another critical 
point. However, for the reasons explained previously, it 
was not possible to extract reliable exponents. 

It is interesting to analyze how the three strategies 
coexist for small values of p. As an example, let us con- 
sider the snapshot of the stationary state of a system 
with p = 0.04 (see Fig. |l^). 




FIG. 10. Distribution of defectors (black boxes), coopera- 
tors (white area) and Tit for Tat strategies (open squares) in 
a system of size 100 x 100, subset of a larger system of size 
256 X 256 for p = 0.04. 

One can observe that " dark areas" (made of defectors) 
are invading the territory ("white areas") of cooperators, 
simultaneously, the " dark areas" are invaded by the T ac- 
tors (open squares). However, the domination of the T's 
is prevented by the external constraint which is leading 
to the growing of C areas within the T territory. When 
p decreases, the T territory expands while the growth of 
white areas slows down. The dark islands become sparse. 
On the contrary, an increase of the value of p accelerates 
the spreading of the C areas as well as their occupation by 
the defectors. Consequently, the D population increases 
with p and the number of T competitors decreases and 
vanishes sX p = pd ■ 

As a result of these cyclic dominant processes a self- 
organized domain structure is maintained in the system. 
Analogous spatio-temporal structures have already been 
observed by Satulowsky and Tome in a two-dimensional 
predator-prey system [|l6| and by Tainaka and Itoh when 
studying competing species |2^]. Both models belong to 
the family of spatially-extended Lotka-Volterra models 
predicting oscillatory behavior of concentrations in the 
simple mean-field approximation |25| . 

One can also analyze the evolution of the typical strat- 
egy configurations in the vicinity of the phase transi- 
tion taking place at [p = Pd)- One recognizes isolated 
colonies of the T strategies whose motions remind us of 
the branching annihilating random walk models. It is 
known that this model belongs to the DP universality 
class too 

The rate of mutual cooperation is related to the aver- 
age payoff per site. The maximum average payoff (which 
value is 4) is reached when all the players cooperate with 
their neighbors. On the other hand, the minimum aver- 
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age payoff per sites coincides witfi tfie maximum of D's 
concentration. 



1.0 




FIG. 11. Average payoff per site sites versus p in the 
two-dimensional model. The closed squares represent the MC 
data obtained for 6 = 1.83 and L — 512. The diamonds indi- 
cate the quantity 4(cc -I- ct). 

In Fig. |ll| the average payoff per site is compared with 
the quantity 4(cc + ct)- One can see that the agreement 
between these two quantities is generally reasonable and 
is even quite good for p < pci- The differences between 
the two quantities are only coming from the fights taking 
place at the boundaries D-C or D-T. However, for this 
range of values of p, the D players form clusters due to 
the presence of the T's and accordingly these fights are 
not frequent. 

Finally, let us briefly consider the case for which the 
parameter h belongs to the second possible range, 1 < 
b < 3/2. The results of the simulations are qualitatively 
similar to the case 3/2 < 6 < 2. The critical values of 
p are lower, namely, Pci = 0.112(1) and Pc2 = 0.297(1). 
The most relevant differences can be observed in the limit 
p — > 0, where the maximum concentration of the T strat- 
egy is strikingly lower (ct < 0.15) than those reported in 
the previous case (see Fig. 0). 



C. Three-dimensional system 

We now consider the case in which the players are lo- 
cated on the sites of a three-dimensional cubic lattice. 
According to the payoff matrix (see Table 1), five ranges 
of values of 6 : 1 < 6 < 2 have now to be distinguished, 
separated by the following values: b = 5/4, 4/3, 3/2, 
and 5/3. Any value of b in one of those ranges will lead 
strictly to the same behavior. Moreover it turns out that 
all the values of 6 : 1 < 6 < 2 leads qualitatively to the 
same behavior. 
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FIG. 12. Stationary state densities of D (closed squares), C 
(open circles) and T (open squares) strategies as a function of 
p as obtained by MC simulation of a system of size 80 x 80 x 80. 
The solid (D), dotted (C) and long-dashed (T) lines indicate 
the predictions of the pair approximation. 

Figure ^ shows a typical phase diagram which is qual- 
itatively similar to the one found for the two-dimensional 
model. MC simulations have been performed for systems 
of sizes 32^, 64'^ and 80'^ for five representative values of 
b exploring all the different ranges. 

When the value of p increases, the system undergoes 
two subsequent phase transitions. The critical values of 
p vary weakly with b. For example, for 3/2 < 6 < 2, we 

have obtained p^f^^ = 0.1441(1) andp^f^^ = 0.4292(5), 

while for 1 < 6 < 3/2, we found p^f^^ = 0.1512(2) and 

Pc2^*"' — 0.4130(3). As expected, the vanishing concen- 
trations behave near the critical points as a power law 
and both transitions belong to the directed percolation 
universality class. For b = 1.6, the numerical analysis 
of the MC data in the vicinity of the second transition 

gives Pc^*"'' = 0.4165(2) and (5i = 0.79(4) in good agree- 
ment with the corresponding directed percolation value 
of /3 = 0.81(2) ||. 

The analysis of the concentration fluctuations xt and 
Xd in the vicinity of the points p = Pci and p = Pci 
leads to the critical exponent 7 = 0.18(8). The large 
uncertainty is due to the small extension of the critical 
regime (typically |p — Pc| < 10""^). The above value of 7 
is compatible with the scaling law for the DP exponents. 

As far as mean-field like approximations are concerned, 
they are supposed to be better when increasing the num- 
ber of nearest neighbors (dimensions). The algebra be- 
comes soon very cumbersome therefore our analysis is 
restricted to the pair approximation. The results are 
given in Fig. ^ One can see that the approximate re- 
sults are in good agreement with the MC data but in the 
close vicinity of the critical points. Note finally that the 
p-dependence of the average payoff is qualitatively sim- 
ilar to those found for the two-dimensional model (see 
Fig. |l|). 
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VI. CONCLUSIONS 
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We have studied quantitatively the emergence of coop- 
eration in a spatially extended version of the prisoner's 
dilemma game with three possible strategies (coopera- 
tion, defection and Tit for Tat) in the absence and pres- 
ence of externally enforced cooperation. The players are 
distributed on a d-dimensional simple cubic lattice and 
their interactions are restricted to nearest neighbors. 

In the absence of external constraint the time evolution 
is controlled by a local adoption of a neighboring strat- 
egy whose introduction is motivated by the Darwinian 
selection rule. When starting the simulations with a ran- 
dom initial state made of only cooperators and defec- 
tors, one founds that, both in d — 1,2 and 3 dimensions, 
the stationary state of the system is a pure defector one. 
However, if the T strategy is also present in the initial 
state, then the stationary state is dominated by mutual 
cooperation. 

The external constraint, forcing the players to adopt 
the strategy C with probability p, has the following con- 
sequences. If only the C and D strategies are present 
in the initial state, then the external constraint enforces 
the cooperation for all value of p. However, if the three 
strategies C, D and T are initially present and if the di- 
mensionality of the system is larger than one, then the 
external constraint reduces the cooperation maintained 
by T for small values of p. The cooperation will be en- 
forced only if the constraint is large enough [p > Pci)- 
These conclusions arc reached both from the extended 
mean-field analysis and the MC simulations for 2 and 3 
dimensional systems. The general features are not af- 
fected by the value of b characterizing the temptation to 
defect. 

Our study confirms the crucial role of the T strategy 
which is able to prevent the spreading of defection. The 
T strategy, however, dyes out in the one-dimensional sys- 
tem as well as in the models for which the simple mean- 
field theory is exact. 

The above conclusions are in agreement with several 
historical facts coming both from the political or eco- 
nomical world. For example, it shows that forcing a frac- 
tion of the population to cooperate in a naive way (C 
strategy) does not improve the overall cooperation in the 
society. It is better to educate more individuals in such 
a way that they will be able to play the more sophisti- 
cated Tit for Tat strategy if one desires to improve the 
cooperation. 

From a nonequilibrium phase transition point of view, 
the above investigations have confirmed that the two sec- 
ond order phase transitions associated with the extinc- 
tion processes belong to the robust directed percolation 
universality class. 

Finally we emphasize that similar behavior is expected 
for spatially extended Lotka-Volterra like systems with 
three (or more) species providing that where one of the 
species is externally favored. 
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